Importing the libraries used for this analysis
library(lubridate)
##
## Attaching package: 'lubridate'
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(rlang)
library(gridExtra)
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
library(ggthemes)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ purrr 1.0.2 ✔ tibble 3.2.1
## ✔ readr 2.1.4 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ purrr::%@%() masks rlang::%@%()
## ✖ gridExtra::combine() masks dplyr::combine()
## ✖ dplyr::filter() masks stats::filter()
## ✖ purrr::flatten() masks rlang::flatten()
## ✖ purrr::flatten_chr() masks rlang::flatten_chr()
## ✖ purrr::flatten_dbl() masks rlang::flatten_dbl()
## ✖ purrr::flatten_int() masks rlang::flatten_int()
## ✖ purrr::flatten_lgl() masks rlang::flatten_lgl()
## ✖ purrr::flatten_raw() masks rlang::flatten_raw()
## ✖ purrr::invoke() masks rlang::invoke()
## ✖ dplyr::lag() masks stats::lag()
## ✖ purrr::splice() masks rlang::splice()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(reshape2)
##
## Attaching package: 'reshape2'
##
## The following object is masked from 'package:tidyr':
##
## smiths
library(corrplot)
## corrplot 0.92 loaded
Loading the dataset
Cleaned_bitcoin_mining <- read.csv("Cleaned_bitcoin_mining.csv")
head(Cleaned_bitcoin_mining)
## Date.and.Time power.MAX..GW power.MIN..GW power.GUESS..GW
## 1 2010-07-18T00:00:00 2.67e-05 2.24e-05 2.44e-05
## 2 2010-07-19T00:00:00 2.68e-05 2.26e-05 2.46e-05
## 3 2010-07-20T00:00:00 2.72e-05 2.29e-05 2.50e-05
## 4 2010-07-21T00:00:00 2.84e-05 2.39e-05 2.61e-05
## 5 2010-07-22T00:00:00 2.82e-05 2.37e-05 2.59e-05
## 6 2010-07-23T00:00:00 2.85e-05 2.40e-05 2.61e-05
## annualised.consumption.MAX..TWh annualised.consumption.MIN..TWh
## 1 0.000233717 0.000196712
## 2 0.000235075 0.000197855
## 3 0.000238699 0.000200905
## 4 0.000249343 0.000209864
## 5 0.000247305 0.000208148
## 6 0.000250023 0.000210436
## annualised.consumption.GUESS..TWh Lower.bound.efficiency..J.Th
## 1 0.000214241 14313700
## 2 0.000215486 14313700
## 3 0.000218808 14313700
## 4 0.000228565 14313700
## 5 0.000226696 14313700
## 6 0.000229188 14313700
## Estimated.efficiency..J.Th Upper.bound.efficiency..J.Th Hydro.only..MtCO2e
## 1 14313700 14313700 4e-06
## 2 14313700 14313700 5e-06
## 3 14313700 14313700 5e-06
## 4 14313700 14313700 5e-06
## 5 14313700 14313700 5e-06
## 6 14313700 14313700 5e-06
## Estimated..MtCO2e Coal.only..MtCO2e Emission.intensity..gCO2e.kWh
## 1 0.000119 0.000214 554.1215
## 2 0.000119 0.000216 554.1215
## 3 0.000121 0.000219 554.1215
## 4 0.000127 0.000229 554.1215
## 5 0.000126 0.000227 554.1215
## 6 0.000127 0.000229 554.1215
## Hash.rate.MH.s
## 1 0.001606373
## 2 0.001822962
## 3 0.001822962
## 4 0.001750766
## 5 0.001669545
## 6 0.001669545
Checking the dimensions and structure of the data
dim(Cleaned_bitcoin_mining)
## [1] 4815 15
str(Cleaned_bitcoin_mining)
## 'data.frame': 4815 obs. of 15 variables:
## $ Date.and.Time : chr "2010-07-18T00:00:00" "2010-07-19T00:00:00" "2010-07-20T00:00:00" "2010-07-21T00:00:00" ...
## $ power.MAX..GW : num 2.67e-05 2.68e-05 2.72e-05 2.84e-05 2.82e-05 2.85e-05 2.86e-05 2.99e-05 3.15e-05 3.23e-05 ...
## $ power.MIN..GW : num 2.24e-05 2.26e-05 2.29e-05 2.39e-05 2.37e-05 2.40e-05 2.41e-05 2.52e-05 2.65e-05 2.72e-05 ...
## $ power.GUESS..GW : num 2.44e-05 2.46e-05 2.50e-05 2.61e-05 2.59e-05 2.61e-05 2.62e-05 2.74e-05 2.88e-05 2.96e-05 ...
## $ annualised.consumption.MAX..TWh : num 0.000234 0.000235 0.000239 0.000249 0.000247 ...
## $ annualised.consumption.MIN..TWh : num 0.000197 0.000198 0.000201 0.00021 0.000208 ...
## $ annualised.consumption.GUESS..TWh: num 0.000214 0.000215 0.000219 0.000229 0.000227 ...
## $ Lower.bound.efficiency..J.Th : num 14313700 14313700 14313700 14313700 14313700 ...
## $ Estimated.efficiency..J.Th : num 14313700 14313700 14313700 14313700 14313700 ...
## $ Upper.bound.efficiency..J.Th : num 14313700 14313700 14313700 14313700 14313700 ...
## $ Hydro.only..MtCO2e : num 4e-06 5e-06 5e-06 5e-06 5e-06 5e-06 5e-06 5e-06 5e-06 5e-06 ...
## $ Estimated..MtCO2e : num 0.000119 0.000119 0.000121 0.000127 0.000126 0.000127 0.000127 0.000133 0.00014 0.000144 ...
## $ Coal.only..MtCO2e : num 0.000214 0.000216 0.000219 0.000229 0.000227 0.000229 0.00023 0.000241 0.000253 0.00026 ...
## $ Emission.intensity..gCO2e.kWh : num 554 554 554 554 554 ...
## $ Hash.rate.MH.s : num 0.00161 0.00182 0.00182 0.00175 0.00167 ...
Our dataset contains 4,815 observations (rows) and 15
variables (columns). The structure of the Bitcoin mining dataset reveals
information related to power consumption, efficiency, CO2 emissions, and
hash rates.
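Since Date.and.Time is stored as a character column, it can help to parse it into a proper date class before any time-based work. A minimal sketch using lubridate (loaded above); the new column name in the commented line is our own choice:

```r
library(lubridate)

# Parse an ISO-8601 timestamp string into a POSIXct date-time,
# then pull out the calendar date and the year.
ts <- ymd_hms("2010-07-18T00:00:00")
as_date(ts)   # Date object for 2010-07-18
year(ts)      # 2010

# Applied to the full column (sketch; the new column name is arbitrary):
# Cleaned_bitcoin_mining$Date <- as_date(ymd_hms(Cleaned_bitcoin_mining$Date.and.Time))
```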
Summary Statistics
summary(Cleaned_bitcoin_mining)
## Date.and.Time power.MAX..GW power.MIN..GW power.GUESS..GW
## Length:4815 Min. : 0.00003 Min. :0.000022 Min. : 0.000024
## Class :character 1st Qu.: 0.39179 1st Qu.:0.031152 1st Qu.: 0.154086
## Mode :character Median : 2.12457 Median :0.384142 Median : 0.905217
## Mean : 9.82974 Mean :2.039373 Mean : 3.989582
## 3rd Qu.:15.41883 3rd Qu.:4.049493 3rd Qu.: 7.710647
## Max. :56.01570 Max. :8.947454 Max. :15.063222
## annualised.consumption.MAX..TWh annualised.consumption.MIN..TWh
## Min. : 0.0002 Min. : 0.0002
## 1st Qu.: 3.4344 1st Qu.: 0.2731
## Median : 18.6240 Median : 3.3674
## Mean : 86.1675 Mean :17.8771
## 3rd Qu.:135.1615 3rd Qu.:35.4978
## Max. :491.0337 Max. :78.4334
## annualised.consumption.GUESS..TWh Lower.bound.efficiency..J.Th
## Min. : 0.00021 Min. : 21
## 1st Qu.: 1.35072 1st Qu.: 38
## Median : 7.93513 Median : 98
## Mean : 34.97267 Mean : 458086
## 3rd Qu.: 67.59153 3rd Qu.: 9917
## Max. :132.04420 Max. :14313700
## Estimated.efficiency..J.Th Upper.bound.efficiency..J.Th Hydro.only..MtCO2e
## Min. : 31 Min. : 46 Min. :0.000004
## 1st Qu.: 68 1st Qu.: 167 1st Qu.:0.028365
## Median : 261 Median : 766 Median :0.166638
## Mean : 771891 Mean : 1292594 Mean :0.734426
## 3rd Qu.: 36553 3rd Qu.: 75000 3rd Qu.:1.419422
## Max. :14313700 Max. :14313700 Max. :2.772928
## Estimated..MtCO2e Coal.only..MtCO2e Emission.intensity..gCO2e.kWh
## Min. : 0.00012 Min. : 0.00021 Min. :359.5
## 1st Qu.: 0.75628 1st Qu.: 1.35207 1st Qu.:512.8
## Median : 4.22858 Median : 7.94307 Median :533.7
## Mean :17.95686 Mean : 35.00765 Mean :532.2
## 3rd Qu.:31.96006 3rd Qu.: 67.65912 3rd Qu.:559.0
## Max. :66.90830 Max. :132.17625 Max. :594.6
## Hash.rate.MH.s
## Min. : 0
## 1st Qu.: 3838
## Median : 3210303
## Mean : 64397862
## 3rd Qu.:111495251
## Max. :506061817
From the summary statistics we can gauge the distribution, central
tendency, and range of each variable, as well as check for the presence
of missing values.
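The large gaps between means and medians above (for example, hash rate mean of roughly 64 million MH/s against a median of about 3.2 million) already hint at strong right skew. A quick numeric check of that idea, shown on toy vectors since the idiom is generic:

```r
# Ratio of mean to median: values far above 1 flag right skew
# in positive-valued data.
skew_gap <- function(x) mean(x) / median(x)

skew_gap(c(1, 2, 3, 1000))   # ~100.6: heavily right-skewed
skew_gap(c(1, 2, 3, 4))      # 1: symmetric
```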
Data cleaning
Checking for missing values
sum(is.na(Cleaned_bitcoin_mining))
## [1] 0
There are no missing values: this is the cleaned dataset, and every
column has complete data for all rows.
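A single grand total can hide which columns would be affected if values were missing; a per-column check is just as cheap. A sketch, with a toy frame illustrating the idiom:

```r
# Count missing values per column rather than one grand total.
# colSums(is.na(Cleaned_bitcoin_mining))  # all zeros for this dataset

# Toy illustration of the same idiom:
toy <- data.frame(a = c(1, NA, 3), b = c(4, 5, 6))
colSums(is.na(toy))   # a = 1, b = 0
```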
Checking the number of unique values
sapply(Cleaned_bitcoin_mining, function(x) length(unique(x)))
## Date.and.Time power.MAX..GW
## 4815 4767
## power.MIN..GW power.GUESS..GW
## 4745 4771
## annualised.consumption.MAX..TWh annualised.consumption.MIN..TWh
## 4771 4750
## annualised.consumption.GUESS..TWh Lower.bound.efficiency..J.Th
## 4774 24
## Estimated.efficiency..J.Th Upper.bound.efficiency..J.Th
## 275 44
## Hydro.only..MtCO2e Estimated..MtCO2e
## 4543 4757
## Coal.only..MtCO2e Emission.intensity..gCO2e.kWh
## 4761 39
## Hash.rate.MH.s
## 3801
Date.and.Time has 4,815 unique values, which means that each row
corresponds to a unique timestamp. Most of the columns have a large
number of unique values, suggesting continuous data, but a few columns,
such as "Lower bound efficiency, J/Th", "Upper bound efficiency, J/Th",
and "Emission intensity, gCO2e/kWh", have far fewer distinct values,
indicating potential categories or repeated measurements.
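The low-cardinality columns can be inspected directly to confirm they behave like categories. A sketch on a toy vector; the commented line shows the same call on the emission-intensity column of the real data:

```r
# Tabulate the distinct values of a low-cardinality column.
x <- c(554.1, 554.1, 533.7, 554.1, 512.8)
table(x)            # counts per distinct value
length(unique(x))   # 3

# On the real data (sketch):
# table(Cleaned_bitcoin_mining$Emission.intensity..gCO2e.kWh)
```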
Univariate Analysis - Analyzing one variable at a time
Histograms - Histograms give insight into the distribution of
continuous variables and help us understand the central tendency,
spread, and shape of the dataset’s distribution.
variables <- c('power.GUESS..GW', 'annualised.consumption.GUESS..TWh', 'Estimated.efficiency..J.Th',
'Hydro.only..MtCO2e', 'Estimated..MtCO2e', 'Coal.only..MtCO2e',
'Emission.intensity..gCO2e.kWh', 'Hash.rate.MH.s')
var_names <- c('Power (GW)', 'Annualised Consumption (TWh)', 'Estimated Efficiency (J/Th)',
'Hydro Only Emissions (MtCO2e)', 'Estimated Emissions (MtCO2e)', 'Coal Only Emissions (MtCO2e)',
'Emission Intensity (gCO2e/kWh)', 'Hash Rate (MH/s)')
df_long <- Cleaned_bitcoin_mining %>%
select(all_of(variables)) %>%
pivot_longer(cols = everything(), names_to = "Variable", values_to = "Value")
df_long$Variable <- factor(df_long$Variable, levels = variables, labels = var_names)
p <- ggplot(df_long, aes(x = Value)) +
geom_histogram(aes(y = after_stat(count)), fill = '#66c2a5', color = '#004d40', bins = 30) +
geom_freqpoly(color = "#e34a33", linewidth = 1) +
facet_wrap(~ Variable, scales = "free", ncol = 2) +
theme_minimal() +
labs(title = "Histograms of Selected Variables", y = "Frequency") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

for(i in seq_along(variables)) {
df_subset <- df_long[df_long$Variable == var_names[i], ]
p <- ggplot(df_subset, aes(x = Value)) +
geom_histogram(aes(y = after_stat(count)), fill = '#66c2a5', color = '#004d40', bins = 30) +
geom_freqpoly(color = "#e34a33", linewidth = 1) +
labs(title = paste("Histogram of", var_names[i]), y = "Frequency") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p)
}
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Outliers
IQR
variables <- c('power.GUESS..GW', 'annualised.consumption.GUESS..TWh', 'Estimated.efficiency..J.Th',
'Hydro.only..MtCO2e', 'Estimated..MtCO2e', 'Coal.only..MtCO2e',
'Emission.intensity..gCO2e.kWh', 'Hash.rate.MH.s')
# sapply() applies the outlier-counting function to each variable in 'variables'
outliers_counts <- sapply(variables, function(var) {
Q1 <- quantile(Cleaned_bitcoin_mining[[var]], 0.25)
Q3 <- quantile(Cleaned_bitcoin_mining[[var]], 0.75)
IQR <- Q3 - Q1
lower_bound <- Q1 - 1.5 * IQR
upper_bound <- Q3 + 1.5 * IQR
outliers <- Cleaned_bitcoin_mining[[var]][Cleaned_bitcoin_mining[[var]] < lower_bound |
Cleaned_bitcoin_mining[[var]] > upper_bound]
length(outliers)
})
names(outliers_counts) <- variables
outliers_counts
## power.GUESS..GW annualised.consumption.GUESS..TWh
## 0 0
## Estimated.efficiency..J.Th Hydro.only..MtCO2e
## 1097 0
## Estimated..MtCO2e Coal.only..MtCO2e
## 0 0
## Emission.intensity..gCO2e.kWh Hash.rate.MH.s
## 214 254
Bitcoin’s popularity, mining difficulty, and technology have evolved
over time. Extreme values in recent years might reflect genuine shifts
in the ecosystem, whereas early outliers might indicate data sparsity
or other anomalies.
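Because most of these variables span several orders of magnitude, plotting on a log scale is a common alternative (or complement) to capping for visualization. A sketch that assumes the df_long frame built for the histograms above:

```r
# Log10-scale histograms often reveal structure that raw-scale
# histograms of heavily right-skewed data compress into one bar.
# Assumes df_long from the earlier pivot_longer step; zero values
# (e.g. the minimum hash rate) are dropped with a warning.
p_log <- ggplot(df_long, aes(x = Value)) +
  geom_histogram(bins = 30, fill = "#66c2a5", color = "#004d40") +
  scale_x_log10() +
  facet_wrap(~ Variable, scales = "free", ncol = 2) +
  theme_minimal() +
  labs(title = "Histograms of Selected Variables (log10 scale)", y = "Frequency")
print(p_log)
```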
Cap/Floor Outliers - Instead of removing the outliers, we can cap
them.
If we believe the extreme values are genuine rather than errors, but
they still unduly influence the analysis, we can consider capping them
at a threshold such as the lower and upper bounds determined by the IQR
method, as this retains the data while reducing the skewness.
For example, any value below the lower bound is set to the lower bound,
and similarly any value above the upper bound is set to the upper bound;
this approach retains the observations without letting extremes dominate.
Cleaned_bitcoin_mining_copy <- Cleaned_bitcoin_mining
for(var in variables) {
Q1 <- quantile(Cleaned_bitcoin_mining_copy[[var]], 0.25)
Q3 <- quantile(Cleaned_bitcoin_mining_copy[[var]], 0.75)
IQR <- Q3 - Q1
lower_bound <- Q1 - 1.5 * IQR
upper_bound <- Q3 + 1.5 * IQR
Cleaned_bitcoin_mining_copy[[var]] <- ifelse(Cleaned_bitcoin_mining_copy[[var]] < lower_bound, lower_bound,
ifelse(Cleaned_bitcoin_mining_copy[[var]] > upper_bound, upper_bound,
Cleaned_bitcoin_mining_copy[[var]]))
}
summary(Cleaned_bitcoin_mining_copy[variables])
## power.GUESS..GW annualised.consumption.GUESS..TWh
## Min. : 0.000024 Min. : 0.00021
## 1st Qu.: 0.154086 1st Qu.: 1.35072
## Median : 0.905217 Median : 7.93513
## Mean : 3.989582 Mean : 34.97267
## 3rd Qu.: 7.710647 3rd Qu.: 67.59153
## Max. :15.063222 Max. :132.04420
## Estimated.efficiency..J.Th Hydro.only..MtCO2e Estimated..MtCO2e
## Min. : 31.13 Min. :0.000004 Min. : 0.00012
## 1st Qu.: 67.72 1st Qu.:0.028365 1st Qu.: 0.75628
## Median : 260.92 Median :0.166638 Median : 4.22858
## Mean :23180.17 Mean :0.734426 Mean :17.95686
## 3rd Qu.:36553.00 3rd Qu.:1.419422 3rd Qu.:31.96006
## Max. :91280.91 Max. :2.772928 Max. :66.90830
## Coal.only..MtCO2e Emission.intensity..gCO2e.kWh Hash.rate.MH.s
## Min. : 0.00021 Min. :443.5 Min. : 0
## 1st Qu.: 1.35207 1st Qu.:512.8 1st Qu.: 3838
## Median : 7.94307 Median :533.7 Median : 3210303
## Mean : 35.00765 Mean :534.2 Mean : 60413666
## 3rd Qu.: 67.65912 3rd Qu.:559.0 3rd Qu.:111495251
## Max. :132.17625 Max. :594.6 Max. :278732371
df_long_capped <- Cleaned_bitcoin_mining_copy %>%
select(all_of(variables)) %>%
pivot_longer(cols = everything(), names_to = "Variable", values_to = "Value")
df_long_capped$Variable <- factor(df_long_capped$Variable, levels = variables, labels = var_names)
p_capped <- ggplot(df_long_capped, aes(x = Value)) +
geom_histogram(aes(y = after_stat(count)), fill = '#66c2a5', color = '#004d40', bins = 30) +
geom_freqpoly(color = "#e34a33", linewidth = 1) +
facet_wrap(~ Variable, scales = "free", ncol = 2) +
theme_minimal() +
labs(title = "Histograms of Capped Variables", y = "Frequency") +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p_capped)
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

for(i in seq_along(variables)) {
df_subset <- df_long_capped[df_long_capped$Variable == var_names[i], ]
p <- ggplot(df_subset, aes(x = Value)) +
geom_histogram(aes(y = after_stat(count)), fill = '#66c2a5', color = '#004d40', bins = 30) +
geom_freqpoly(color = "#e34a33", linewidth = 1) +
labs(title = paste("Capped Histogram of", var_names[i]), y = "Frequency") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(p)
}
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

After capping and flooring, the extreme values in the data were
limited to a more standardized range. The capped/floored data likely
still retains its right-skewed nature for many variables.
The spread of the data now appears more compact, without the long
tails that were previously present due to outliers.
Segmentation Analysis - A method used to divide a dataset into
subsets, here with outliers (the original data) and without outliers
(the capped data).
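One way to carry out that comparison is a small side-by-side summary of the two segments. A sketch assuming the original frame, the capped copy, and the variables vector from above:

```r
# Mean and maximum of each selected variable, before and after capping.
# Assumes Cleaned_bitcoin_mining, Cleaned_bitcoin_mining_copy, variables.
segment_summary <- data.frame(
  variable      = variables,
  mean_original = sapply(variables, function(v) mean(Cleaned_bitcoin_mining[[v]])),
  mean_capped   = sapply(variables, function(v) mean(Cleaned_bitcoin_mining_copy[[v]])),
  max_original  = sapply(variables, function(v) max(Cleaned_bitcoin_mining[[v]])),
  max_capped    = sapply(variables, function(v) max(Cleaned_bitcoin_mining_copy[[v]]))
)
segment_summary
```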
Bivariate Analysis - Analyzing the relationship between two or more
variables.
Correlation Matrix
cor_matrix <- cor(Cleaned_bitcoin_mining[variables], use = "complete.obs")
col <- colorRampPalette(c("#BB4444", "#EE9988", "#FFFFFF", "#77AADD", "#4477AA"))
corrplot(cor_matrix, method = "color", type = "upper",
col = col(200),
tl.col = "black",
tl.srt = 90,
order = "hclust",
addCoef.col = "black",
number.cex = 0.5,
title = "Correlation Matrix", mar=c(0,0,1,0))

Highly Correlated Variables:
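The pairs behind this heading can be listed programmatically from the matrix rather than read off the plot. A sketch assuming cor_matrix from the chunk above; the 0.9 cutoff is an arbitrary choice:

```r
# Variable pairs whose absolute correlation exceeds a threshold.
# Assumes cor_matrix from the corrplot chunk above.
threshold <- 0.9
idx <- which(abs(cor_matrix) > threshold & upper.tri(cor_matrix), arr.ind = TRUE)
data.frame(
  var1 = rownames(cor_matrix)[idx[, 1]],
  var2 = colnames(cor_matrix)[idx[, 2]],
  r    = cor_matrix[idx]
)
```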
Two-sample t-test to compare power.GUESS..GW before and after January
3, 2013
The t-test is used to determine if there is a statistically
significant difference between the means of two groups.
# Parse the character timestamps so the comparison is on Date objects,
# not raw strings.
before_2013 <- subset(Cleaned_bitcoin_mining, as.Date(Date.and.Time) < as.Date("2013-01-03"))
after_2013 <- subset(Cleaned_bitcoin_mining, as.Date(Date.and.Time) >= as.Date("2013-01-03"))
t_result <- t.test(before_2013$power.GUESS..GW, after_2013$power.GUESS..GW)
print(t_result)
##
## Welch Two Sample t-test
##
## data: before_2013$power.GUESS..GW and after_2013$power.GUESS..GW
## t = -66.997, df = 3914.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -5.036382 -4.749997
## sample estimates:
## mean of x mean of y
## 0.01100724 4.90419668
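The t-test compares means and assumes those group means are roughly normally distributed; with distributions as right-skewed as these, a rank-based Wilcoxon (Mann-Whitney) test is a common robustness check. A sketch, assuming the before/after subsets from above:

```r
# Rank-based alternative to the t-test, less sensitive to heavy skew.
# Assumes before_2013 and after_2013 from the chunk above.
wilcox.test(before_2013$power.GUESS..GW, after_2013$power.GUESS..GW)
```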
T-test for Selected Variables
results <- list()
for(var in variables) {
if(any(is.na(before_2013[[var]])) || any(is.na(after_2013[[var]]))) {
results[[var]] <- "Contains NA values"
} else if(length(unique(before_2013[[var]])) == 1 || length(unique(after_2013[[var]])) == 1) {
results[[var]] <- "Constant values in one or both periods"
} else {
result <- t.test(before_2013[[var]], after_2013[[var]])
results[[var]] <- result
}
}
for(var in variables) {
cat("T-test results for", var, ":\n")
print(results[[var]])
cat("\n---------------------------------------------\n")
}
## T-test results for power.GUESS..GW :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = -66.997, df = 3914.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -5.036382 -4.749997
## sample estimates:
## mean of x mean of y
## 0.01100724 4.90419668
##
##
## ---------------------------------------------
## T-test results for annualised.consumption.GUESS..TWh :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = -66.997, df = 3914.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -44.14892 -41.63848
## sample estimates:
## mean of x mean of y
## 0.09648949 42.99018811
##
##
## ---------------------------------------------
## T-test results for Estimated.efficiency..J.Th :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = 25.546, df = 899.31, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 3659260 4268296
## sample estimates:
## mean of x mean of y
## 3994776.19 30998.28
##
##
## ---------------------------------------------
## T-test results for Hydro.only..MtCO2e :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = -66.997, df = 3914.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.9271274 -0.8744080
## sample estimates:
## mean of x mean of y
## 0.002026276 0.902793953
##
##
## ---------------------------------------------
## T-test results for Estimated..MtCO2e :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = -66.447, df = 3914.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -22.66799 -21.36865
## sample estimates:
## mean of x mean of y
## 0.05411622 22.07243658
##
##
## ---------------------------------------------
## T-test results for Coal.only..MtCO2e :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = -66.997, df = 3914.4, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -44.19307 -41.68011
## sample estimates:
## mean of x mean of y
## 0.09658598 43.03317831
##
##
## ---------------------------------------------
## T-test results for Emission.intensity..gCO2e.kWh :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = 53.295, df = 4099.5, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 32.17905 34.63700
## sample estimates:
## mean of x mean of y
## 559.4096 526.0015
##
##
## ---------------------------------------------
## T-test results for Hash.rate.MH.s :
##
## Welch Two Sample t-test
##
## data: before_2013[[var]] and after_2013[[var]]
## t = -47.902, df = 3914, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -82443618 -75960296
## sample estimates:
## mean of x mean of y
## 9.051497e+00 7.920197e+07
##
##
## ---------------------------------------------